On the Entropy of Written Spanish

Author

  • Fabio G. Guerrero
Abstract

This paper reports results on the entropy of the Spanish language. The results are based on an analysis of natural language for n-word symbols (n = 1 to 18), trigrams, digrams, and characters, drawn from twelve different literary works in Spanish and a 279,917-word news file provided by the Spanish press agency EFE. Entropy values are calculated by a direct method using computer processing and the law of large numbers. Three samples of artificial Spanish language produced by a first-order model software source are also analyzed and compared with natural Spanish language.
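As a rough illustration of what a direct entropy calculation can look like, the sketch below estimates entropy from the relative frequencies of n-gram symbols, treating each observed frequency as an approximation of the symbol's probability. It is not the author's software: the function name `ngram_entropy`, the whitespace tokenization, and the choice of base-2 logarithms (bits) are all assumptions made for this example.

```python
import math
from collections import Counter


def ngram_entropy(tokens, n=1):
    """Plug-in entropy estimate in bits per n-gram symbol (hypothetical helper).

    `tokens` may be a list of words or a list of characters; each run of
    n consecutive tokens is treated as one symbol.
    """
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    # Relative frequencies stand in for symbol probabilities.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy input, assumed for illustration only (not data from the paper).
sample_words = "el perro y el gato y el pez".split()
word_entropy = ngram_entropy(sample_words, n=1)      # single-word symbols
two_word_entropy = ngram_entropy(sample_words, n=2)  # 2-word symbols
char_entropy = ngram_entropy(list("el perro y el gato"), n=1)
```

The same function covers characters, digrams, trigrams, and n-word symbols by changing the token list and `n`. The estimate is only meaningful when the sample is large relative to the number of distinct symbols, which is why long n-word symbols call for large corpora.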


Similar resources

Complexity measurement of natural and artificial languages

We compared entropy for texts written in natural languages (English, Spanish) and artificial languages (computer software) based on a simple expression for the entropy as a function of message length and specific word diversity. Code text written in artificial languages showed higher entropy than text of similar length expressed in natural languages. Spanish texts exhibit more symbolic diversit...


Quantifying Structure Differences in Literature Using Symbolic Diversity and Entropy Criteria

We measured entropy and symbolic diversity of texts written in English and Spanish. We included texts by Literature Nobel laureates and other famous authors. We formed four groups of texts according to the combinations of language used and the author's Literature Nobel Prize condition. Entropy, symbol diversity and symbol frequency profiles were compared for these four groups. We also built a s...


Phonological Awareness Impact on Articulatory Accuracy of the Spanish Liquid [r] in Japanese FL Learners of Spanish

Foreign language learners tend to avoid phonological difficulties and simply transfer sounds whether from their L1 or any pre-existing L2. Phonological awareness (PA) gives students an active role in understanding their own potential in improving pronunciation through several methods. However, such methods are likely to be restricted to only passive learning methods, such as repetition, reading...


Automatic Recovery of Punctuation Marks and Capitalization Information for Iberian Languages

This paper shows experimental results concerning automatic enrichment of the speech recognition output with punctuation marks and capitalization information. The two tasks are treated as two classification problems, using a maximum entropy modeling approach. The approach is language independent as reinforced by experiments performed on Portuguese and Spanish Broadcast News corpora. The discrimi...



Journal:
  • CoRR

Volume: abs/0901.4784

Publication date: 2009